Hey everyone! 👋
Another week, another update on my GSoC journey with ProLIF!
📝 Current Pull Request Status
The primary focus this week was developing a robust test suite for the residue placement feature. This process revealed some edge cases with overlaps, which led to further refinements in the placement algorithm.
Algorithm Refinements
Here are the main improvements to address overlap edge cases:
Increased maximum iterations: The maximum number of iterations to check for overlaps has been raised from 20 to 100. The loop can, however, break early if an iteration finds no overlaps. A single iteration checks for overlaps among all residues and between residues and ligand atoms.
Increased force for residue-residue overlaps: If the iteration count exceeds 50, the repulsive force to separate overlapping residues is increased by a factor of 1.6.
    force = (min_distance - dist) * 0.5
    if itr >= 50:
        force *= 1.6
Increased force for residue-ligand overlaps: Similarly, if the iteration count exceeds 50, the repulsive force to separate an overlapping residue from a ligand atom is increased by a factor of 4/3.
    force = (min_distance - dist) * 1.5
    if itr >= 50:
        force *= 4/3
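To make these refinements concrete, here is a minimal, simplified sketch of such a loop. It is not the actual ProLIF code: the function name `resolve_overlaps`, the 2D data layout, the choice to keep ligand atoms fixed, and the handling of coincident points are all assumptions for illustration. Only the constants (100 iterations, the 0.5 and 1.5 base factors, and the 1.6 and 4/3 boosts from `itr >= 50`) come from the snippets above.

```python
import math


def resolve_overlaps(residues, ligand_atoms, min_distance, max_iter=100):
    """Push overlapping points apart in place; return the iterations used.

    residues: dict mapping a label to a mutable [x, y] list (hypothetical layout).
    ligand_atoms: list of (x, y) tuples, assumed to stay fixed.
    """
    labels = list(residues)
    iterations = 0
    for itr in range(max_iter):
        iterations = itr + 1
        found_overlap = False

        # Residue-residue overlaps: both residues move in opposite directions.
        for i, a in enumerate(labels):
            for b in labels[i + 1:]:
                dx = residues[a][0] - residues[b][0]
                dy = residues[a][1] - residues[b][1]
                dist = math.hypot(dx, dy)
                if dist >= min_distance:
                    continue
                found_overlap = True
                force = (min_distance - dist) * 0.5
                if itr >= 50:
                    force *= 1.6
                # Assumption: coincident points are pushed apart along x.
                ux, uy = (dx / dist, dy / dist) if dist else (1.0, 0.0)
                residues[a][0] += ux * force
                residues[a][1] += uy * force
                residues[b][0] -= ux * force
                residues[b][1] -= uy * force

        # Residue-ligand overlaps: only the residue moves.
        for a in labels:
            for lx, ly in ligand_atoms:
                dx = residues[a][0] - lx
                dy = residues[a][1] - ly
                dist = math.hypot(dx, dy)
                if dist >= min_distance:
                    continue
                found_overlap = True
                force = (min_distance - dist) * 1.5
                if itr >= 50:
                    force *= 4 / 3
                ux, uy = (dx / dist, dy / dist) if dist else (1.0, 0.0)
                residues[a][0] += ux * force
                residues[a][1] += uy * force

        if not found_overlap:
            break  # early exit: this pass found nothing to fix
    return iterations
```

Because of the early break, a layout that is already well separated costs a single pass, so raising the cap to 100 only matters for the crowded cases that actually need the extra iterations.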
The Test File
To ensure the residue placement algorithm is robust, I developed a new test file using pytest. The main goal is to verify that the generated coordinates for residues are free of overlaps.
Here’s how the test works:
Setting up a complex scenario: The test uses a fixture to create a pandas.DataFrame with multiple residues and interactions designed to be prone to overlaps. This creates a challenging test case for the algorithm.
Generating coordinates: A LigNetwork object is created with the test data, and it generates the coordinates for all residues and ligand atoms.
Defining a minimum distance: A dynamic min_distance threshold is calculated based on the ligand’s dimensions. This ensures that residues are not placed too close to each other or to the ligand.
    min_distance: float = min(100, max(width, height) * 0.3)
Checking for overlaps: The test then performs two key checks:
- Residue-residue overlaps: It iterates through all pairs of protein residues and asserts that the distance between them is greater than or equal to min_distance.
- Residue-ligand overlaps: It checks the distance between each residue and every ligand atom, ensuring it also meets the min_distance requirement.
This test helps confirm that the overlap resolution logic works correctly, even in crowded interaction environments.
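For readers who want to see the shape of these checks, here is a rough, self-contained sketch. It does not use the real fixture or the LigNetwork object: the helpers `compute_min_distance` and `find_overlaps`, the coordinate dictionaries, and the sample labels and values are hypothetical stand-ins, and I assume `width` and `height` are the extents of the ligand's bounding box. Only the `min_distance` formula is taken from the test described above.

```python
import itertools
import math


def compute_min_distance(width, height):
    # Formula from the test: 30% of the larger ligand dimension, capped at 100.
    return min(100, max(width, height) * 0.3)


def find_overlaps(residue_coords, ligand_coords, min_distance):
    """Return (label_a, label_b, distance) for every pair closer than min_distance."""
    overlaps = []
    # Residue-residue: every unordered pair of residues.
    for (a, pa), (b, pb) in itertools.combinations(residue_coords.items(), 2):
        d = math.dist(pa, pb)
        if d < min_distance:
            overlaps.append((a, b, d))
    # Residue-ligand: each residue against every ligand atom.
    for res, pr in residue_coords.items():
        for atom, pl in ligand_coords.items():
            d = math.dist(pr, pl)
            if d < min_distance:
                overlaps.append((res, atom, d))
    return overlaps


def test_no_overlaps():
    # Made-up coordinates standing in for the generated layout.
    ligand = {"C1": (0.0, 0.0), "C2": (200.0, 0.0), "N1": (100.0, 120.0)}
    residues = {"ASP129.A": (100.0, -80.0), "PHE330.A": (300.0, 60.0)}

    xs = [x for x, _ in ligand.values()]
    ys = [y for _, y in ligand.values()]
    min_distance = compute_min_distance(max(xs) - min(xs), max(ys) - min(ys))

    assert find_overlaps(residues, ligand, min_distance) == []
```

Collecting the offending pairs in a list, rather than asserting inside the loops, means a failing run reports every overlap at once instead of stopping at the first one.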
📆 What’s Next?
I’ll focus on updating the documentation and tutorials to include information about the new changes.
Thanks for reading and keeping up with my progress! More updates soon as we drive these visuals and tests forward. 🧬📊